Using Vignettes to Measure the Quality of Health Care
Jishnu Das (World Bank)
I. Introduction
No matter how one looks at it—as differences across nations or as differences within nations—poor people systematically suffer from worse health outcomes than rich people. What role does medical care play?
This note outlines a research project that seeks to measure the quality of care, to understand how quality varies by geographical location and by sector (public, private, or non-governmental organization), and to examine whether and how the quality of care affects health choices and outcomes. We discuss an instrument, vignettes, and a measure of the quality of care, competence, which focuses on what doctors know: the maximum quality of medical advice that doctors could provide if they did everything they knew to do. A vignette is a simulated patient presented to a doctor, paired with an instrument that evaluates the quality of care provided by that doctor; performance on the vignette is an indicator of the doctor’s competence, skill, or ability. We show how competence can be validated and what can be learned from correlations between competence and various attributes of the health-care provider. We also propose ways in which this measure can be collected widely, while arguing for some uniformity in cross-country studies to enable broader comparisons.
The note is structured as follows. Section II presents a prima facie case for (a) incorporating the quality of care in studies of the demand for health care and of health outcomes and (b) measuring the quality of care through the quality of medical advice that doctors give to patients rather than, for instance, the infrastructure in a facility. Section III introduces vignettes as a measurement tool and describes how the data are collected; Section IV discusses how vignettes are validated; Section V presents results from recent studies; Section VI concludes with some lessons learned, caveats, and thoughts for further research.
II. Why and how should we measure quality of care?
Numerous studies have documented the role of households in producing good health outcomes: children are healthier when mothers are more educated; rich households are better able to “insure” against health shocks; and rich households live in areas with better sanitation and enjoy better nutrition. Based on these studies, explanations for health outcomes among poor people have centered almost exclusively on household choices: either poor people do not use the health system as much as they should, or they go to doctors only when it is too late. However, recent work shows that even where the poor do visit health facilities frequently, often more frequently than the rich, their health outcomes remain dismal; the quality of the medical system must therefore also play a large role in health outcomes.
Earlier studies sought to measure the quality of care through the presence or absence of a primary health care center, and found little or no relationship between the existence of a health care center and health outcomes. The lack of a relationship left many questions about providers unanswered: Was the lack of a relationship because the doctor was never there? Was the doctor qualified (holding a degree) and competent (knowledgeable)? The data to answer these crucial questions simply didn't exist.
The next set of studies tried to address these questions by using “structural” measures of quality; that is, quality alternatively defined by physical infrastructure, the stock of medical supplies, the total number of assigned personnel, the availability of refrigeration units, the availability of electricity or a combination of some of these (Collier and others 2003; Lavy and Germain 1994). Both studies found that health-care demand responded to structural quality—more people visited health clinics when the structural quality was higher.
A remarkable omission from these indicators is any measure of process quality, particularly the quality of medical personnel. If structural quality were well correlated with process quality, this omission could be justified on the grounds that data on structural quality are easier to collect. However, the two are not well correlated, and there is good reason to believe that process quality is more important than structural quality. First, structural measures such as drug availability are largely determined by the degree of subsidy and the cost of transportation, making structural quality a predictable feature of ownership and location, whereas process quality is more likely to vary within these parameters. Indeed, to the degree that one facility experiences more pharmacy stock-outs than another similar facility, the likely reason is higher demand, so structural measures can misclassify popular facilities as low quality. Second, whereas both medicine and consultation are likely to be important to a patient’s health, households can mitigate problems with drug supply through purchases in other markets, whereas they cannot do this with medical care (see, for example, Foster 1995).
We propose to measure the (maximum) quality of medical advice a patient is likely to receive when he or she consults a doctor, and the correlates of this quality. Such data are harder to collect than structural quality, since they typically require detailed interviews with the doctor, clinical observation of the doctor’s interactions with a number of patients, or both. Together with structural quality, this research presents a more complete picture of the quality of care.
III. Process Quality Evaluation Methods
Why use vignettes?
There are many ways to measure process quality, and these instruments, both actual and theoretical, vary in their realism, relevance, and comparability. A realistic instrument collects data on doctors’ activities in a setting that closely resembles the setting in which most care is delivered. A relevant instrument collects data on processes that matter, in the sense that the observed activities are important to outcomes for a large segment of the potential patient population. A comparable instrument collects data that can be compared across a broad spectrum of health-care providers and settings.
In practice, some compromise across these goals is inevitable. For example, the fake patient (in which an actor poses as a patient and visits a large number of providers) is both comparable (the same patient visits all providers) and realistic (the doctor is unaware that this patient is not a regular patient). However, the fake patient is unlikely to be relevant, because an actor can only pretend to be sick with a very limited set of illnesses (a headache or sore muscle, for example), and these illnesses are rarely the subject of our research; a fake patient cannot convincingly fake tuberculosis, for example. Direct clinician observation (observing the activities of doctors with their regular patients) is realistic[1] and relevant, but it is not generally comparable, because strong assumptions are necessary to compare the activities of doctors who see very different types of illnesses and patients. Vignettes, the subject of this note, are specifically designed to be both comparable and relevant: the same case is presented to every doctor, and the researcher can choose to present almost any relevant illness. However, vignettes are not as realistic as other instruments, because doctors know that the patient is an actor pretending to be sick, not a regular patient. We argue that this shortcoming can be mitigated through proper design of the instruments and proper interpretation of the results.
Vignettes will play an important role in investigations where relevance and comparability are overriding concerns. Research focused on the distribution and determinants of the distribution of quality on a regional, national or international scale must be both relevant and comparable and is likely to benefit from the inclusion of data generated by the use of vignettes. On the other hand, an empirical investigation of health seeking behavior in a particular population or setting will put a higher premium on realism than relevance or comparability and would find vignettes less well suited to the problem than other instruments might be.
What are Vignettes?
There are many different types of vignettes used in health care research today. The underlying element that connects these different instruments is the presentation of a “case” to the doctor paired with an instrument that evaluates the activities of the doctor. In some versions of vignettes, the case is a fixed narrative read from a script and in others someone pretending to be the patient acts out the case. In some types of vignettes, the doctor is asked to list the activities that he or she would undertake and in other types of vignettes, the doctor interacts with the “patient” by asking questions to which the “patient” responds. And, in some types of vignettes, the doctor is prompted by the examiner with questions such as “would you prescribe treatment X?”
The vignettes used in Das and Hammer (2005) and Leonard and Masatu (2005) rely on an enumerator trained to act as a sick person rather than an enumerator who reads from a script. The characteristics of the illness and the patient are predetermined but unknown to the doctor. Except for the opening complaints described by the patient, the practitioner must ask questions to discover the characteristics of the illness. Because the patient is not actually sick, the physical examination is conducted in question-and-answer format: the doctor explains what he is looking for, and the patient tells him what he would find. The measured quality of the consultation is based on the doctor’s activities during the consultation. Because doctors know the patient is simulated and the physical examination is done through question and answer, some realism is sacrificed. However, unlike some other types of instruments, the process by which the doctor examines and diagnoses the patient resembles the process the doctor would normally undertake: even though the doctor knows the simulated patient is not real, the doctor is sitting at his or her desk and gathering information about a case in order to reach a diagnosis and prescribe a treatment. Our version of vignettes is as realistic as we can make it without sacrificing comparability or relevance.
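To fix ideas, the sketch below (in Python) shows one simple way the activities observed during a vignette consultation could be recorded and scored. It is an illustration only, not the actual field instrument: the checklist items are hypothetical, and the studies discussed here summarize responses with item response theory rather than a raw share of items completed.

# Illustrative only: record each essential history question, examination step,
# and treatment action as done / not done, then compute a raw completion score.
# The item names are hypothetical and stand in for a real vignette checklist.
essential_items = {
    "asked_duration_of_cough": True,
    "asked_about_fever": True,
    "asked_about_weight_loss": False,
    "examined_chest": True,
    "ordered_sputum_test": False,
    "prescribed_appropriate_treatment": False,
}

completed = sum(essential_items.values())          # True counts as 1
raw_score = completed / len(essential_items)
print(f"Completed {completed} of {len(essential_items)} essential items "
      f"(raw score = {raw_score:.2f})")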
How easy/hard is it to use vignettes?
It is impossible to measure process quality accurately without visiting all facilities in a sample, and vignettes require that doctors be present at the time of the visit. Once a doctor has been located, however, vignettes are relatively inexpensive to administer. The increased realism of our vignettes does have an important cost: enumerators must completely memorize a case presentation and be sufficiently well trained to adapt to the different questions posed by practitioners while maintaining the same characteristics across all practitioners. This is much more challenging than training someone to read a list of questions off an instrument.
IV. Validating Vignettes
Do vignettes measure competence, and is competence as measured by vignettes correlated with important underlying aspects of quality? To answer these questions, vignettes can be checked for internal consistency, and the results obtained with vignettes can be compared to results obtained using more realistic instruments, in this case direct observation.
Internal Validity with Item Response Theory
The researcher simultaneously designs the case-study patient and the diagnostic and treatment protocol. In some cases (
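To make the IRT approach concrete, the sketch below illustrates, under simplified assumptions, how a doctor's competence can be scored from binary vignette item responses with a two-parameter logistic (2PL) model. The item parameters are invented for illustration, and the code is not the exact specification estimated in Das and Hammer (2005); in practice, item parameters and competence are calibrated jointly on the full sample of doctors.

# Sketch of a 2PL item response model for vignette checklist items:
#   P(item j completed | competence theta) = 1 / (1 + exp(-a_j * (theta - b_j)))
# where a_j is the item's discrimination and b_j its difficulty.
# Item parameters below are hypothetical.
import numpy as np
from scipy.optimize import minimize_scalar

a = np.array([1.2, 0.8, 1.5, 1.0, 0.6])    # discriminations (hypothetical)
b = np.array([-1.0, 0.0, 0.5, 1.0, 2.0])   # difficulties (hypothetical)
y = np.array([1, 1, 1, 0, 0])              # one doctor's item responses

def item_probabilities(theta):
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def neg_log_likelihood(theta):
    p = item_probabilities(theta)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Maximum-likelihood estimate of this doctor's competence.
fit = minimize_scalar(neg_log_likelihood, bounds=(-4.0, 4.0), method="bounded")
print(f"Estimated competence (theta): {fit.x:.2f}")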
Comparing Vignettes and Direct Observation
We have been careful to cast vignettes as a measure of competence, not a measure of actual practice quality. Leonard and Masatu (2005) compare the performance of doctors on vignettes and their performance in practice with their regular patients. This comparison suggests that performance on the vignette is a measure of maximum performance in practice but not a measure of actual performance. However, the vignette score and the practice quality score are significantly correlated over the sample; a doctor’s actual performance is closely tied to his potential or maximum performance. Despite their lack of realism, vignettes are a good first order measure of practice quality. That said, the differences between competence and practice quality are potentially important and Leonard, Masatu and Vialou (2006) suggest that the gap between ability and practice is partially determined by the organizational form of the health facility. Comparing quality within organizational type (public, NGO or private) with vignettes is probably more valid than comparing quality across organizational types.
Additional sources of validation
In
V. Some Results
Now that we have a measure of quality based on competence, what do we do with it? As a first pass, we can benchmark the quality of care and provide some information about whether care is of high quality or not. Next, we can look at differences in the quality of care, for example across geographical or income groups.
Results on the baseline quality of care
Despite the evidence that performance on vignettes is likely to be an upper bound, the overall quality of care is low, although there is considerable variation across countries and even within countries over time. In
Results on correlates of quality of care across countries
Though the quality of care is generally low, it is not evenly distributed. In
In addition, there is little evidence to suggest that purely private health care is better than public health care. Although the private care available in urban areas and the wealthy areas of towns is generally superior to that available in the rural or poor neighborhoods, this is no different than the pattern in public health facilities. Private providers in the rural and poor areas are not superior to public providers in those areas (Das and Hammer, 2006; Leonard and Masatu, 2006b).
VI. Discussion
Do vignettes measure aspects of medical quality that matter?
Clearly, vignettes do not measure everything that is important in health care and a measure of practice quality that was more realistic while retaining the relevance and comparability of vignettes would be desirable. Nonetheless, vignettes make an important contribution to knowledge because they allow some understanding of the distribution of competence, which in turn, is correlated with the distribution of practice quality. In addition, competence as measured by vignettes is not a function of location. The exact same case is presented to urban and rural doctors, doctors accustomed to treating poor patients and doctors accustomed to treating rich patients, etc. The differences in competence across doctors may be highly correlated with location, but they are not caused by location.
Other measures of process quality are more susceptible to being endogenously determined by location. For example, educated patients are more likely to argue with doctors or to insist on providing information that might be of use to the doctor. Thus, a doctor who sees mostly educated patients is more likely to follow protocol simply because those patients either encourage it or make it easier. Whether or not allowing the education level of the patient to influence quality is a good thing, it is clearly caused by the patient mix, not by the skills of the doctor. The vignette avoids this problem because it controls for illness and patient characteristics: the researcher can choose to implement the vignette with an informative or an uninformative patient, but the same patient will be presented to all doctors.
The distinction between quality that is poor because of a location and quality that is poor at a location is important for policy. The results we have found with vignettes suggest that lower-quality doctors locate near poor patients. This is true even within the public sector, which suggests that poorer-quality doctors are sent to work with poorer patients. From the perspective of the patient, it does not matter whether a doctor learned to be good from his experience with other patients or whether he was already good when he was sent there; but for the administration of the public health-care system this difference is very important, and it can only be uncovered through the use of an instrument like vignettes.
The importance of quality measures with standard errors
Process quality measures of any type are likely to suffer from measurement error, and vignettes are no exception. Unlike many other measures of process quality, however, we can approximate the error in our measure of competence using IRT analysis. IRT analysis on the vignettes collected in
It is tempting to look at the results obtained from vignettes and conclude that the standard errors are too high. However, it is not that other instruments have lower standard errors, only that many of them have unmeasured or unmeasurable standard errors.
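As an illustration of why the IRT framework delivers a standard error along with the score, the sketch below computes the 2PL test information function, I(theta) = sum_j a_j^2 P_j(theta)(1 - P_j(theta)), whose inverse square root is the standard error of the competence estimate. The item parameters are again hypothetical, and the numbers do not reproduce Figure 1.

# Sketch: the standard error of an IRT competence estimate comes from the
# test information function; more informative vignettes yield smaller SEs.
import numpy as np

a = np.array([1.2, 0.8, 1.5, 1.0, 0.6])    # hypothetical discriminations
b = np.array([-1.0, 0.0, 0.5, 1.0, 2.0])   # hypothetical difficulties

def test_information(theta):
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return np.sum(a ** 2 * p * (1 - p))

for theta in (-2.0, 0.0, 2.0):
    info = test_information(theta)
    print(f"theta = {theta:+.1f}: information = {info:.2f}, "
          f"standard error = {1.0 / np.sqrt(info):.2f}")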
Lessons for the design of vignettes
The information scores shown in Figure 1 give a picture of the accuracy of the different case studies and their contributions to the overall assessment of competence. The patterns suggest that even for illnesses that are relevant and comparable, some cases are more useful than others. The TB vignette in
Issues for the implementation of vignettes on an international scale
As we have alluded to above, implementing vignettes requires extensive training and some degree of adaptation to the local setting. The questions that an actor should be prepared to answer will generally differ from one country to another, particularly for questions that are not medically relevant but for which answers must be standardized. Thus, it is unlikely that a single vignette manual could be designed to work anywhere in the world. In addition, illnesses such as malaria differ greatly across regions and would be difficult to standardize. However, there are potentially important benefits to using the same cases across countries. TB is the same in
References
Banerjee, A., A. Deaton, and E. Duflo, “Health care delivery in rural Rajasthan,” Economic and Political Weekly, 2004, pp. 944–950.
Barber, Sarah L., Paul J. Gertler, and Pandu Harimurti, “Promoting high quality care in
Barber, Sarah L., Paul J. Gertler, and Pandu Harimurti, “The effect of the zero growth policy in civil service recruitment on the quality of care in
Collier, P., S. Dercon, and J. MacKinnon, “Density versus quality in health care provision: using household data to make budgetary choices in
Das, Jishnu and Jeffrey Hammer, “Which Doctor?: Combining Vignettes and Item Response to Measure Doctor Quality,” Journal of Development Economics, 2005, 78, 348–383.
Das, Jishnu and Jeffrey Hammer, “Location, location, location: Residence, Wealth and the Quality of Medical Care in
Filmer, D., J. Hammer, and L. Pritchett, “Weak links in the chain: a diagnosis of health policy in poor countries,” World Bank Research Observer, 2000, 15 (2), 199–224.
Leonard, Kenneth L., and Melkiory C. Masatu, “Comparing vignettes and direct clinician observation in a developing country context,” Social Science and Medicine, 2005, 61 (9), 1944–1951.
Leonard, Kenneth L., and Melkiory C. Masatu, “Outpatient process quality evaluation and the
Leonard, Kenneth L., and Melkiory C. Masatu, “Variation in the quality of care accessible to rural communities in
Leonard, Kenneth L., Melkiory C. Masatu, and Alex Vialou, “Getting Doctors to do their best: the roles of ability and motivation in health care,” processed manuscript,
Figure 1 Information by Vignette and Country
[1] Leonard and Masatu (2006a) show that the presence of the researcher can affect the activities of the doctor in ways that call into question the realism of the process, but these effects can be controlled for.
[2] The vignettes used in